WAD-CMSN: Wasserstein distance-based cross-modal semantic network for zero-shot sketch-based image retrieval

Abstract

Zero-shot sketch-based image retrieval (ZSSBIR) aims at retrieving natural images from free hand-drawn sketches whose categories may not appear during training. Previous approaches used semantically aligned sketch-image pairs or a memory-expensive fusion layer to project the visual information into a low-dimensional subspace, which ignores the significant heterogeneous cross-domain discrepancy between a highly abstract sketch and the relevant image. This yields poor performance in the training phase. To overcome this drawback, we propose a Wasserstein distance-based cross-modal semantic network (WAD-CMSN) for ZSSBIR. Specifically, it first projects the visual information of each branch (sketch, image) into a common low-dimensional subspace via the Wasserstein distance in an adversarial manner. Furthermore, a novel identity matching loss is employed to select useful features, which can not only capture complete knowledge but also alleviate the over-fitting phenomenon caused by the WAD-CMSN model. Experimental results on the challenging Sketchy (Extended) and TU-Berlin datasets indicate the effectiveness of the proposed model over several competitors.
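
To make the central quantity concrete, the following is a minimal sketch of the Wasserstein-1 distance in the simplest possible setting: two equal-size sets of one-dimensional values. It is not the WAD-CMSN training procedure, which the abstract does not specify in detail; the function name `wasserstein_1d` and the feature values are assumptions made purely for illustration.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan pairs the sorted values,
    # so the distance is the mean absolute gap between matched points.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical 1-D projections of sketch and image features (made-up numbers).
sketch_feats = [0.0, 1.0, 2.0]
image_feats = [3.0, 1.0, 2.0]

# Distribution gap that an alignment objective would try to shrink.
gap = wasserstein_1d(sketch_feats, image_feats)
```

In a high-dimensional adversarial setting this distance is usually estimated by a learned critic rather than computed by sorting; the closed form above only holds for one-dimensional samples.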

Similar resources

Zero-Shot Sketch-Image Hashing

Recent studies show that large-scale sketch-based image retrieval (SBIR) can be efficiently tackled by cross-modal binary representation learning methods, where Hamming distance matching significantly speeds up the process of similarity search. Providing training and test data subjected to a fixed set of pre-defined categories, the cutting-edge SBIR and cross-modal hashing works obtain acceptab...

A Radon-based Convolutional Neural Network for Medical Image Retrieval

Image classification and retrieval systems have gained more attention because of easier access to high-tech medical imaging. However, the lack of large-scale, balanced, labelled data in medicine is still a challenge. Simplicity, practicality, efficiency, and effectiveness are the main targets in the medical domain. To achieve these goals, Radon transformation, which is a well-known t...

Attribute-Guided Network for Cross-Modal Zero-Shot Hashing

Zero-Shot Hashing aims at learning a hashing model that is trained only on instances from seen categories but can generalize well to those of unseen categories. Typically, it is achieved by utilizing a semantic embedding space to transfer knowledge from the seen domain to the unseen domain. Existing efforts mainly focus on the single-modal retrieval task, especially Image-Based Image Retrieval (IBIR). Howeve...

Sketch Based Image Retrieval

Sketch-based image retrieval is a task that has been widely explored recently as an alternative method for image retrieval. We develop this task on The Sketchy Database, where we use Siamese and Triplet networks to perform sketch-based image retrieval. We employ a deep residual learning network as the constituent network in the Siamese and Triplet architectures and use new data augmentation techniqu...

Sketch Based Image Retrieval

Content-based image retrieval (CBIR) is one of the most common and growing research areas in digital image processing. Most existing image search tools, such as Google Images and Yahoo! Image Search, are built on textual annotation of images. In these tools, images are manually annotated with keywords and then retrieved using text-based search methods. The presentations ...

Journal

Journal title: International Journal of Wavelets, Multiresolution and Information Processing

Year: 2022

ISSN: 0219-6913, 1793-690X

DOI: https://doi.org/10.1142/s0219691322500540